AITopics | optimal representation

Collaborating Authors

optimal representation

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

A Sufficient-Statistic Reduction of the Information Bottleneck to a Low-Dimensional Problem

Armstrong, Joss

arXiv.org Machine Learning, Apr-30-2026

We show that if the conditional distribution p(C | T) factors through a sufficient statistic ϕ(T), then the Information Bottleneck (IB) problem for (T, C) is exactly equivalent to the IB problem for (ϕ(T), C). The reduction is loss-free: it preserves the full IB curve, the Lagrangian optimum at every trade-off parameter β, and the optimal representations up to pullback through ϕ. As a result, the computational complexity of solving the IB problem is governed by the dimension of the sufficient statistic rather than the ambient dimension of the source. This identifies an exact structural condition under which the generic IB problem becomes tractable, and gives a formal bridge between the discrete and linear-Gaussian regimes. We then show that the classical Gaussian IB solution of Chechik, Globerson, Tishby and Weiss is an immediate corollary of this reduction, and we state a nonlinear-Gaussian generalisation. A small numerical example illustrates the practical consequence: when a low-dimensional sufficient statistic is available, the exact IB curve can be computed on the reduced problem at a cost determined by the statistic rather than by the ambient source dimension.
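One ingredient of the reduction described in this abstract can be checked numerically on a toy case: when p(C | T) depends on T only through ϕ(T), the relevance term satisfies I(T; C) = I(ϕ(T); C). The sketch below is illustrative only and is not taken from the paper; the distributions `p_t` and `p_c_given_phi`, the map `phi`, and the helper `mutual_information` are all hypothetical names and values chosen for the example. It does not compute the IB curve or verify the full equivalence claimed above.

```python
import math
from collections import defaultdict


def mutual_information(joint):
    """Mutual information (in nats) of a joint distribution {(x, y): p}."""
    px = defaultdict(float)
    py = defaultdict(float)
    for (x, y), p in joint.items():
        px[x] += p
        py[y] += p
    return sum(
        p * math.log(p / (px[x] * py[y]))
        for (x, y), p in joint.items()
        if p > 0
    )


# Hypothetical toy source: T takes four values, phi collapses them to two.
p_t = {0: 0.1, 1: 0.4, 2: 0.3, 3: 0.2}
phi = {0: "a", 1: "a", 2: "b", 3: "b"}

# Assumption of the example: p(C | T) depends on T only through phi(T).
p_c_given_phi = {"a": {0: 0.9, 1: 0.1}, "b": {0: 0.2, 1: 0.8}}

# Joint distribution of (T, C) on the full four-valued source.
joint_tc = {
    (t, c): p_t[t] * p_c_given_phi[phi[t]][c]
    for t in p_t
    for c in (0, 1)
}

# Joint distribution of (phi(T), C), obtained by merging values of T.
joint_sc = defaultdict(float)
for (t, c), p in joint_tc.items():
    joint_sc[(phi[t], c)] += p

i_tc = mutual_information(joint_tc)
i_sc = mutual_information(dict(joint_sc))
print(f"I(T; C)      = {i_tc:.6f} nats")
print(f"I(phi(T); C) = {i_sc:.6f} nats")
```

The two printed values agree up to floating-point error, even though the reduced joint table has half as many rows as the original.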

artificial intelligence, machine learning, theorem 6, (16 more...)

arXiv.org Machine Learning

2604.26744

Country: North America > United States (0.46)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.70)

Parameters as interacting particles: long time convergence and asymptotic error scaling of neural networks

Neural Information Processing Systems, Mar-16-2026, 18:27:36 GMT

The performance of neural networks on high-dimensional data distributions suggests that it may be possible to parameterize a representation of a given high-dimensional function with controllably small errors, potentially outperforming standard interpolation methods. We demonstrate, both theoretically and numerically, that this is indeed the case. We map the parameters of a neural network to a system of particles relaxing with an interaction potential determined by the loss function. We show that in the limit that the number of parameters $n$ is large, the landscape of the mean-squared error becomes convex and the representation error in the function scales as $O(n^{-1})$. In this limit, we prove a dynamical variant of the universal approximation theorem showing that the optimal representation can be attained by stochastic gradient descent, the algorithm ubiquitously used for parameter optimization in machine learning. In the asymptotic regime, we study the fluctuations around the optimal representation and show that they arise at a scale $O(n^{-1})$. These fluctuations in the landscape identify the natural scale for the noise in stochastic gradient descent. Our results apply to both single and multi-layer neural networks, as well as standard kernel methods like radial basis functions.

artificial intelligence, machine learning, proceedings, (8 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Parameters as interacting particles: long time convergence and asymptotic error scaling of neural networks

Grant Rotskoff, Eric Vanden-Eijnden

Neural Information Processing Systems, Feb-12-2026, 08:33:52 GMT

The performance of neural networks on high-dimensional data distributions suggests that it may be possible to parameterize a representation of a given high-dimensional function with controllably small errors, potentially outperforming standard interpolation methods. We demonstrate, both theoretically and numerically, that this is indeed the case. We map the parameters of a neural network to a system of particles relaxing with an interaction potential determined by the loss function.

artificial intelligence, arxiv, machine learning, (16 more...)

Neural Information Processing Systems

Country:

Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.05)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > Canada > Quebec > Montreal (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Towards Sample-efficient Overparameterized Meta-learning

Neural Information Processing Systems, Feb-11-2026, 18:47:24 GMT

An overarching goal in machine learning is to build a generalizable model with few samples.

artificial intelligence, machine learning, representation, (16 more...)

Neural Information Processing Systems

Country: Oceania > Australia > New South Wales > Sydney (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Towards Sample-efficient Overparameterized Meta-learning

Neural Information Processing Systems, Feb-11-2026, 18:47:20 GMT

We then integrate these findings to obtain an overall performance guarantee for our meta-learning algorithm.

artificial intelligence, machine learning, representation, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Bottleneck Structure in Learned Features: Low-Dimension vs Regularity Tradeoff

Neural Information Processing Systems, Feb-11-2026, 10:16:58 GMT

This formalizes a balance between learning low-dimensional representations and minimizing complexity/irregularity in the feature maps, allowing the network to learn the 'right' inner dimension.

artificial intelligence, machine learning, representation, (16 more...)

Neural Information Processing Systems

Country: Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Regularized linear autoencoders recover the principal components, eventually

Neural Information Processing Systems, Feb-8-2026, 09:17:36 GMT

Our understanding of learning input-output relationships with neural nets has improved rapidly in recent years, but little is known about the convergence of the underlying representations, even in the simple case of linear autoencoders (LAEs).

artificial intelligence, machine learning, representation, (16 more...)

Neural Information Processing Systems

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > Canada > Ontario > Toronto (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)

A Geometric Perspective on Optimal Representations for Reinforcement Learning

Neural Information Processing Systems, Dec-25-2025, 06:51:36 GMT

We propose a new perspective on representation learning in reinforcement learning based on geometric properties of the space of value functions. From there, we provide formal evidence regarding the usefulness of value functions as auxiliary tasks in reinforcement learning. Our formulation considers adapting the representation to minimize the (linear) approximation of the value function of all stationary policies for a given environment. We show that this optimization reduces to making accurate predictions regarding a special class of value functions which we call adversarial value functions (AVFs). We demonstrate that using value functions as auxiliary tasks corresponds to an expected-error relaxation of our formulation, with AVFs a natural candidate, and identify a close relationship with proto-value functions (Mahadevan, 2005). We highlight characteristics of AVFs and their usefulness as auxiliary tasks in a series of experiments on the four-room domain.

geometric perspective, optimal representation, value function, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.55)

Contrastive and Non-Contrastive Self-Supervised Learning Recover Global and Local Spectral Embedding Methods

Neural Information Processing Systems, Dec-24-2025, 23:13:15 GMT

Self-Supervised Learning (SSL) surmises that inputs and pairwise positive relationships are enough to learn meaningful representations. Although SSL has recently reached a milestone, outperforming supervised methods in many modalities, the theoretical foundations are limited, method-specific, and fail to provide principled design guidelines to practitioners. In this paper, we propose a unifying framework under the helm of spectral manifold learning. Through the course of this study, we will demonstrate that VICReg, SimCLR, BarlowTwins et al. correspond to eponymous spectral methods such as Laplacian Eigenmaps, ISOMAP et al. From this unified viewpoint, we obtain (i) the closed-form optimal representation, (ii) the closed-form optimal network parameters in the linear regime, (iii) the impact of the pairwise relations used during training on each of those quantities and on downstream task performances, and most importantly, (iv) the first theoretical bridge between contrastive and non-contrastive methods to global and local spectral methods respectively, hinting at the benefits and limitations of each. For example, if the pairwise relation is aligned with the downstream task, all SSL methods produce optimal representations for that downstream task.

local spectral embedding method, name change, representation, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
